Conversation

@Hanjun-Dai
Contributor

There are actor_model_config, reference_model_config, and rollout_model_config, whose values can conflict with each other if the user doesn't overwrite all of them explicitly. The benefits of this patch are:

Consistency

Current
[Screenshot: printed config after overriding the model]
The user overrides the model to qwen-0.6b, but the rest of the configs still show llama-8b, which is very confusing.

After this patch
[Screenshot: printed config after the patch, with consistent model settings]

Reusing checks in cli/config.py

A lot of checks and validations are based on model_config. Right now, for RL, the config appears to be set through reference_model_config. To make this consistent, we can overwrite model_config when no explicit configuration is set for it, as sketched below.
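For illustration, here is a minimal sketch of the intended fallback, using a hypothetical resolve_model_config helper and plain dicts rather than the actual config objects in tunix/cli/config.py:

```python
def resolve_model_config(cli_overrides, model_config, reference_model_config):
    """Hypothetical helper sketching the fallback described above.

    `cli_overrides` is the set of dotted keys the user set explicitly on the
    command line (e.g. {"model_config.model_name"}); the two configs are
    treated as plain dicts for illustration only.
    """
    if any(key.startswith("model_config.") for key in cli_overrides):
        # The user touched model_config directly; respect it as-is.
        return dict(model_config)
    # No explicit model_config settings: inherit the RL reference model so
    # that downstream checks in cli/config.py see a consistent model.
    return dict(reference_model_config)


# Example: only reference_model_config was overridden on the CLI.
resolved = resolve_model_config(
    cli_overrides={"reference_model_config.model_name"},
    model_config={"model_name": "llama-8b"},
    reference_model_config={"model_name": "qwen-0.6b"},
)
assert resolved["model_name"] == "qwen-0.6b"
```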

Checklist

  • I have added all the necessary unit tests for my change.
  • I have verified that my change does not break existing code and all unit tests pass.
  • I have added all appropriate doc-strings/documentation.
  • My PR is based on the latest changes of the main branch (if unsure, rebase the code).
  • I have signed the Contributor License Agreement.
  • I have followed Contribution Guidelines.

@wang2yn84
Collaborator

Thank you for your PR! Can you add the corresponding test cases to tunix/cli/config_test.py? Thank you!

@wang2yn84
Collaborator

Also, can you provide the command to generate the "current" model config? Do you overwrite the model of the base config?

@Hanjun-Dai
Contributor Author

> Thank you for your PR! Can you add the corresponding test cases to tunix/cli/config_test.py? Thank you!

working on it!

@Hanjun-Dai
Contributor Author

> Also, can you provide the command to generate the "current" model config? Do you overwrite the model of the base config?

The existing scripts should still work. If there's no direct modification to model_config (e.g., through the CLI), then model_config will take its value from reference_model_config (reference_model_config is what's currently used when setting up RL).
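As a rough illustration of the kind of test this could get in tunix/cli/config_test.py, here is a self-contained sketch; it re-states the fallback with a stand-in helper rather than the project's real config classes:

```python
import unittest


def _resolve_model_config(cli_overrides, model_config, reference_model_config):
    # Stand-in for the fallback sketched above: inherit reference_model_config
    # unless the user set model_config explicitly on the command line.
    if any(key.startswith("model_config.") for key in cli_overrides):
        return dict(model_config)
    return dict(reference_model_config)


class ModelConfigFallbackTest(unittest.TestCase):

    def test_inherits_reference_when_not_overridden(self):
        resolved = _resolve_model_config(
            cli_overrides=set(),
            model_config={"model_name": "llama-8b"},
            reference_model_config={"model_name": "qwen-0.6b"},
        )
        self.assertEqual(resolved["model_name"], "qwen-0.6b")

    def test_explicit_model_config_is_preserved(self):
        resolved = _resolve_model_config(
            cli_overrides={"model_config.model_name"},
            model_config={"model_name": "llama-8b"},
            reference_model_config={"model_name": "qwen-0.6b"},
        )
        self.assertEqual(resolved["model_name"], "llama-8b")


if __name__ == "__main__":
    unittest.main()
```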

@Hanjun-Dai
Contributor Author

Closing this one, as #960 provides a better solution.

@Hanjun-Dai closed this on Jan 10, 2026